PREFACE

THIS little book had its origin in three Special University
Lectures given in the Beveridge Hall of the Senate House,
London University, in January 1953. Those lectures were
addressed to students of psychology, and gave a geometrical
approach to psychometric problems; but it is hoped that these
pages may interest a wider audience, for many of the formulae
have been applied, or are applicable, in other sciences. The
greater part of the book requires only a very meagre
mathematical equipment from the reader, little more than the
knowledge that the square on the hypotenuse of a right-angled
triangle equals the sum of the squares on the sides, the
meaning of the cosine and sine of an angle, and two simple
trigonometrical formulae. Other assumptions occur but are
explained, and the later part of Section Eight is indicated as
meant for a more advanced reader, and can be omitted.

What is demanded from the reader is a willingness to
think spatially, even in terms of spaces of higher dimensions
than three. Indeed, these pages may well form an introduction,
for a budding mathematician, to the later more strenuous
study of n-dimensional geometry.

GODFREY THOMSON 


EDINBURGH 
March 1953 
SECTION ONE
THE MODEL 


THE primary purpose of this book is to describe a geometrical 
model from which can be deduced most of the formulae used
in the factorial analysis of human ability. Since many of these 
are, however, also used in other fields (for example the formula 
concerning selection, and partial regression co-efficients), it is 
hoped that the book may have a wider appeal. 
The model is one in many dimensions, but it is best 
approached by dealing with its simpler forms first, in two 
and in three dimensions. Consider first the ordinary
‘scattergram’ illustrating the correlation between two variates, the
one variate being measured along the x axis, the other along
the y axis at right angles to the former, and each associated
pair of values being represented by its point (x, y). If each
variate is distributed normally, these points will vary in
density over the diagram, being thickest at the position
corresponding to the average of both variates and having
elliptical density contours round that point. If, further, each
variate is measured from its average, and in units of its own
standard deviation, the contours of density of the points will
be ellipses

x² − 2rxy + y² = constant   (1)

and the major axis of these ellipses will be equally inclined to
the co-ordinate axes. The correlation is measured by the
quantity r (see Figure 1).
This quantity r can be calculated from the paired values of
x and y by the formula

r = Sum (xy) / (n σx σy)

where n σx² = Sum (x²), and a corresponding formula for σy. If
the standard deviations are taken as units in measuring x and y
the previous formula becomes simply

r = Sum (xy)/n


or

r = Mean (xy)


is a distribution fulfilling these two requirements, but a normal
distribution is a smooth continuous one, and may be looked
upon as a binomial distribution with a very large number of
terms, very close together. Its actual formula, when measured
from the middle, is

y = (1/√(2π)) e^(−x²/2)

where e is the base of Naperian logarithms.

At this point it is desirable to say that throughout this book
these two assumptions (that the distribution of each variate
is normal and that it is measured in units of its own standard
deviation) are made. Some of the formulae proved by using
the geometrical model have, however, a validity wider than
for normally distributed variates, and all of them are
approximately true when the departure from normal distribution is
not great. As for measuring in units of standard deviation,
this only means that if the variates are in practice measured in
other units, it is necessary to change to the standard units
before using the model to prove anything. Any formula can
thereafter be changed back into any arbitrary units.
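The change to standard units, and the simple form r = Sum (xy)/n which it allows, can be sketched in a few lines of Python. This is only an illustration: the function names and the raw scores are hypothetical, and the standard deviation is taken with divisor n, which is an assumption chosen to agree with n σx² = Sum (x²).

```python
import math

def standardize(values):
    """Measure each value from the mean, in units of the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Divisor n (not n - 1), an assumption matching n * sigma^2 = Sum(x^2).
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sigma for v in values]

def correlation(xs, ys):
    """r = Sum(xy) / n, computed on standardized scores."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)

# Hypothetical raw scores of five persons in two tests.
scores_test_1 = [12, 15, 9, 20, 14]
scores_test_2 = [30, 34, 25, 41, 30]
r = correlation(scores_test_1, scores_test_2)
print(round(r, 3))
```

Any other units of measurement give the same r, since the standardizing step removes them before the products are summed.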

In the above scattergram outlined in Figure 1, if the two 
variates are scores in two mental tests, the points like P in the 
figure represent persons who get scores equal to their co- 
ordinates. Thus P represents a man or a child who scores 
OX and OY in the two tests. The person P, and every person 
whose point is in the north-east quadrant, is above average in 
both tests. Those in the south-west quadrant are below 
average in both. Those in the other two quadrants are above 
average in one, below average in the other, test. 

Along each of the rectangular co-ordinate axes x and y there 
are really two coincident lines, one of them representing the 
variate (the test score), the other separating the people who do 
well in the other test from those who do badly. As long as 
the test-lines are at right angles to each other, these two lines 
remain coincident. But the change in the diagram about to be 
described will cause them to separate. We are about to 
compress the crowd of persons along the major axis until 
their contours of density are no longer ellipses, but circles. 
FIG. 2

Before doing so we note, what is easily proved from the
formula x² − 2rxy + y² = constant, that the major and minor
axes are in the ratio

minor axis / major axis = √(1 − r) / √(1 + r)   (2)


he minor axis, Imagine this 
Persons before it until, when 
of density have changed from 
ellipses into circles—for a similar compression towards O 
must be envisaged from the other end of t 
This double compression (which must beu 
on suitably for every elliptical 
OB to OD, and OC to ОЕ. 
OA² = 1 + r and OE² = 1 − r

OF² = OE² + EF² = OE² + OA² = 2

and

cos² EOF = OE²/OF² = (1 − r)/2

cos DOF = 2 cos² EOF − 1 = −r
These lines OD and OF are still the lines which separate
‘above average’ from ‘below average’ persons. All the
persons who were in the quadrant BOC, and therefore good in
both tests, are now in the wider sector DOF. We still want
their scores in the tests to be represented by the vertical
projection of their points on to the test-lines. It is therefore
evident that the two test-lines must be at right angles to OD
and OF respectively, in the dotted positions shown in Figure 2,
and the angle between those dotted lines (the test-lines) is the
supplement of DOF and its cosine is r, the correlation
coefficient.

We have changed from a diagram in which the existence of 
correlation is shown by the ellipticity of the density contours 
of points representing persons, to a diagram in which the crowd 
of points representing persons is circular in contour, and in 
which the correlation between two tests is shown by their 
lines being, not at right angles, but at an angle whose cosine 
is r. 

It will be noticed that by compressing the crowd in the way
we have done, namely inward along the major axis, we have
altered the scale of the diagram, so that all standard deviations
have sunk to below unity. We ought really to have made
our change from ellipse to circle by partly compressing the
major axis and partly stretching the minor axis: but that would
have unduly complicated Figure 2, and we can attain the same
end result by imagining the circle expanded until the scale is
once more unity, a procedure which will not change any angles
and in particular will leave cos UOV = r. In the diagram then
reached, each test-line can be looked upon as a ‘vector’. A
vector is a direction with a weight or strength attached to it.
In our model the weight of each is the standard deviation, in
the full test-line unity.


If the number of persons above average in both tests be a,
the same number will, under our assumption of normal
distribution, be below average in both; and b will be above in
one and below in the other, with an equal number b below in
the one and above in the other. Those above average in both
are, in our Figure 2, all those in the sector DOF: and in this
circular crowd it is evident that the angle DOF is aπ/(a + b)
while the supplement to DOF is bπ/(a + b). But the angle UOV
between the dotted test-lines is the supplement of DOF, and
its cosine is r. So we have Sheppard’s formula

r = cos (bπ/(a + b))
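Sheppard’s formula, r = cos (bπ/(a + b)), lends itself to a very short numerical sketch. The function name below is hypothetical; a is the number of persons above average in both tests and b the number above in one and below in the other, as in the text.

```python
import math

def sheppard_r(a, b):
    """Estimate r from quadrant counts by Sheppard's formula.

    a: persons above average in both tests (an equal number are below in both)
    b: persons above in the one test and below in the other (one such quadrant)
    """
    return math.cos(math.pi * b / (a + b))

# Equal counts in all quadrants: the test-lines are at right angles.
print(round(sheppard_r(50, 50), 6))
# Three times as many concordant as discordant persons: angle UOV is 45 degrees.
print(round(sheppard_r(75, 25), 4))
```

When b is zero the test-lines coincide and r is unity; when a is zero they point in opposite directions.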


in all three tests) will represent a test whic 
combination of the first three, 


A fourth test will in general project into a fourth dimension,
and a model of n tests will be in n dimensions. The crowd of



SECTION TWO 
SELECTION 


WITH the aid of a three-dimensional model let us first examine
the changes in correlation coefficients produced by a selection
which reduces the scatter of a variate. In Figure 3 let OA, OB
and OC be the lines representing three tests, and let us call the
angles AOC, BOC and AOB, α, β and γ. Suppose that in test
(variate) 3 a selection of persons is made such that their
standard deviation on scores in that test is reduced from unity
to p₃. What is the effect on the correlations between the three
tests? Let us suppose in the first place that the average score
in test 3 is not altered by the selection.

Let ACB be a plane at right angles to OC, and take OC as
unity. Then the selection will result in the plane ACB being
replaced by one parallel to it in the position A′C′B′ where
OC′ = p₃. A corresponding movement towards O must be
imagined in the upper part of Figure 3.
Before these selections and resulting changes in the diagram,
the correlation coefficients between the tests were r₁₃ = cos α,
r₂₃ = cos β, and r₁₂ = cos γ. The new correlation coefficients
are the cosines of the angles A′OC′ or α′, B′OC′ or β′, and
A′OB′ or γ′. Let us write p² + q² = 1 and find the values, in
the figure, of p and q for each test. We have

p₃ = OC′,   p₁ = OA′/OA,   p₂ = OB′/OB   (3)

Further,

OA² − OA′² = OB² − OB′² = OC² − OC′² = q₃²   (4)

Also

q₁² = (OA² − OA′²)/OA² = q₃²/OA² = q₃² cos² α

and similarly q₂² = q₃² cos² β

That is, q₁ = q₃r₁₃ and q₂ = q₃r₂₃. It is test 3 which has been
directly selected. The other two tests have had their scatter
indirectly reduced as a result, and their q’s are got from q₃ by
multiplying it by their correlation coefficients with the directly
selected test.


The correlation between the directly selected test 3 and
test 1 is

cos A′OC′ = OC′/OA′ = (p₃/p₁) cos α

i.e. the new r₁₃ = (p₃/p₁) r₁₃

Similarly the new r₂₃ = (p₃/p₂) r₂₃


The correlation between the two indirectly selected tests 1
and 2 is a little more complicated. We have

AB² = OA² + OB² − 2 OA.OB cos γ

and A′B′² = OA′² + OB′² − 2 OA′.OB′ cos γ′

whence, using equations (3) and (4), we get, since AB = A′B′,

cos γ′ = (OA.OB cos γ − q₃²)/(OA′.OB′) = (cos γ − q₃² cos α cos β)/(p₁p₂)

i.e. the new r₁₂ = (r₁₂ − q₁q₂)/(p₁p₂)   (5)
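The whole chain of results for a selection on test 3 (the p’s and q’s, the new r₁₃ and r₂₃, and formula (5) for the new r₁₂) may be gathered into a short Python sketch. The function name and the numerical correlations are hypothetical, chosen only to show the arithmetic.

```python
import math

def select_on_test_3(r12, r13, r23, p3):
    """Correlations after test 3 has its standard deviation reduced from 1 to p3."""
    q3 = math.sqrt(1.0 - p3 ** 2)
    # The indirectly selected tests: q is q3 times the correlation with test 3.
    q1, q2 = q3 * r13, q3 * r23
    p1 = math.sqrt(1.0 - q1 ** 2)
    p2 = math.sqrt(1.0 - q2 ** 2)
    new_r13 = p3 * r13 / p1
    new_r23 = p3 * r23 / p2
    new_r12 = (r12 - q1 * q2) / (p1 * p2)   # formula (5)
    return new_r12, new_r13, new_r23

# Hypothetical correlations; the scatter of test 3 is halved by the selection.
print(select_on_test_3(0.5, 0.6, 0.7, 0.5))
```

With p₃ = 1 (no selection at all) the function returns the original correlations unchanged, which is a convenient check on the algebra.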


This more general formula (5) includes as a special case the
formula for the partial correlation r₁₂.₃ where the selected
test 3 is restricted to one value and its standard deviation is
zero. That is, p₃ = 0 and q₃ = 1. Then

q₁ = r₁₃ and q₂ = r₂₃, from equation (4)

and formula (5) becomes

r₁₂.₃ = (r₁₂ − r₁₃r₂₃) / √{(1 − r₁₃²)(1 − r₂₃²)}   (6)
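Formula (6) for the partial correlation can be written out directly. The sketch below is an illustration only; the function name and the figures are hypothetical.

```python
import math

def partial_correlation(r12, r13, r23):
    """Correlation of tests 1 and 2 when test 3 is held to a single value (formula (6))."""
    return (r12 - r13 * r23) / math.sqrt((1.0 - r13 ** 2) * (1.0 - r23 ** 2))

# Hypothetical correlations between three tests.
print(round(partial_correlation(0.5, 0.6, 0.7), 4))
```

If r₁₂ is exactly the product r₁₃r₂₃, the partial correlation vanishes: the whole of the correlation between tests 1 and 2 is then accounted for by test 3.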


We have so far assumed that the selection of test 3 was such
that its average was unchanged. In our Figure 3 we have not
altered the position of O. But a study of Figure 3 makes it
evident that if instead of moving the plane ACB up towards O
we had moved O down by the same amount towards ACB,
to a new position O′, we would not have changed anything
essential, and could with practically the same trigonometry
have arrived at exactly the same formula (5). Or we might
have moved ACB a little up and O a little down, until the
distance between them was again p₃. In short, provided the
distributions remain normal, a change in the mean caused by
selection still allows formulae (5) and (6) to be used. And in
fact, even if the distributions are not quite normal, these
formulae have a kind of average validity.


SECTION THREE
PROJECTION TO LINES, PLANES AND SPACES


THIS section is devoted to the statement of a principle of
geometry which will be frequently used in the pages which
follow. Most readers will probably be already familiar with
it, but a few may not, especially in higher dimensions.


By the projection of a point A on to a line XY is meant the
point C at the foot of the perpendicular from A on to the line
(see Figure 4). By the projection of a distance AB on to the
line is meant the distance CD in the figure,

CD = AB cos θ = AB sin φ

The lines AB and XY need not be in the same plane. If not,
the angle θ is the angle between AB and a line parallel to XY
and cutting AB


The lines forming the on 
can trespass into a space of many dimensions, provided each
one begins where the preceding one ended. A point can also


angles to the plane of the table-top and th 
is a right angle. Or it can be pro 
thus. Take any point O in OE and join it to P and to D. 


Then by our construction the angles DPO, DPQ and OQP are
right angles,

Therefore OD² = DP² + OP²
and OP² = OQ² + QP²
i.e. OD² = DP² + OQ² + QP²
But DP² + QP² = DQ²
therefore OD² = DQ² + OQ²

and so OQD must be a right angle

FIG. 4


The principle here seen in three dimensions is also true in
many dimensions. If instead of the table-top we imagine a
space of n dimensions, and a point D outside that space, then
D can be projected on to that n-space at a point P (for a
perpendicular from outside a space on to the space hits it at a
point and goes through it at once: the space has no thickness
in that direction). If then P is projected on to a subspace of
m dimensions inside the n-space (m < n), it will hit that
subspace at the same point Q as though D had been projected
directly on to the subspace. For n = 3 and m = 2 (and 1) this is
illustrated in Figure 5, where by a kind of superperspective
four dimensions are portrayed on the paper.

A 3-space is defined by the lines OU, OV and OW (which


20 THE GEOMETRY OF MENTAL MEASUREMENT 


may be three test-lines). D is not in that 3-space but outside
it, and DP is a perpendicular to the space. From D also
a perpendicular DQ is drawn to the plane defined by OU and
OV. Then join PQ.

AQ is any line through Q in the plane UOV, and A is then
joined to P and D. It has now to be shown that PQ is at right
angles to the plane UOV. (The angles don't look like right
angles, but that is because of the superperspective. In ordinary
perspective right angles don’t always or indeed usually look
like right angles.)

FIG. 5

Because DP is at right angles to the whole 3-space OUVW,
the angles DPA and DPQ are right angles: and because DQ
is at right angles to the plane UOV, DQA and DQR are right
angles,

Therefore DA² = DQ² + QA²
and also = DP² + PA²
Further, DQ² = DP² + PQ²
so DP² + PA² = (DP² + PQ²) + QA²
or PA² = PQ² + QA²

so PQA is a right angle


And since AQ was any line in the plane UOV through Q,
PQ is at right angles to that plane. The same point Q is
reached whether D is projected directly on to the plane UOV,
or first on to the space OUVW and thence on to UOV.

Similar reasoning shows that R, the projection of Q on to
OU (m = 1), is also the projection of D and of P.
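The principle of successive projection can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the three test-lines and the point D are hypothetical co-ordinates in a four-space, and the helper functions (an orthonormal basis built by the Gram-Schmidt process, and projection on to the space it spans) are not part of the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors):
    """Gram-Schmidt: an orthonormal basis of the space spanned by the vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:
            basis.append([wi / norm for wi in w])
    return basis

def project(point, spanning_vectors):
    """Foot of the perpendicular from the point on to the spanned space."""
    result = [0.0] * len(point)
    for e in orthonormal_basis(spanning_vectors):
        c = dot(point, e)
        result = [r + c * ei for r, ei in zip(result, e)]
    return result

# Hypothetical test-lines in a four-space, and a point D outside their 3-space.
OU = [1.0, 0.0, 0.0, 0.0]
OV = [0.6, 0.8, 0.0, 0.0]
OW = [0.3, 0.4, 0.5, 0.0]
D = [0.5, 1.5, -0.7, 2.0]

P = project(D, [OU, OV, OW])      # D on to the 3-space OUVW
Q_via_P = project(P, [OU, OV])    # thence on to the plane UOV
Q_direct = project(D, [OU, OV])   # D directly on to the plane UOV
R_via_Q = project(Q_via_P, [OU])  # thence on to the line OU
R_direct = project(D, [OU])       # D directly on to the line OU
print(Q_via_P, Q_direct)
print(R_via_Q, R_direct)
```

Both routes reach the same point Q, and both reach the same point R, whatever oblique test-lines are chosen to define the spaces.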


SECTION FOUR


HIGHER SPACES 


SPACES of more dimensions than three are of course only a
manner of speech. There is no suggestion that such spaces
actually exist in the sense our familiar three-dimensional
space exists for us who live in it. But with a little care, and
some acceptance of new features, many of the geometrical
laws we know in our familiar space are still true in higher
dimensions, and it is convenient to use the familiar terms.
For example, a sphere is a surface everywhere the same distance
from a point called its centre, and its equation is

x² + y² + z² = radius²

In a higher space, say of five dimensions, we still give the
name sphere, or hypersphere, to a surface whose equation is

x² + y² + z² + v² + w² = radius²

Some things which happen in higher space are rather startling
to the layman. For instance, in a seven-space one can have a
subspace of four dimensions, and another of three dimensions,
which are completely orthogonal to one another, so that every
line in the one space is at right angles to every line in the other.
Such completely orthogonal spaces have only one point in
common with each other. The analogue in our ordinary
space is a line (of one dimension) perpendicular to a plane (of
two dimensions). The sum of the dimensions of two such
orthogonal spaces cannot be greater than the number of
dimensions of the space containing them.

An instance at first sight apparently contradicting this is
two planes at right angles to one another in our familiar
three-space, like a drawing-board standing on edge on a
table-top. They have a whole line in common; and the sum
of their dimensions is four, greater than three. But these two
planes are not completely orthogonal. Many lines can be


SECTION FIVE 


THE COSINE LAW 


IN a three-space we can, given a set of three co-ordinate axes,
define a point P (see Figure 6) by its three ordinates, x, y and z.
A line from the origin O through P we can define by the cosines
of the three angles it makes with the co-ordinate axes. These
are called its direction cosines. If the distance OP is taken as
unity, then the co-ordinates x = OA, y = AB and z = BP are
also the direction cosines of the line OP. And the sum of
their squares is unity. For OA² + AB² = OB², and then
OB² + BP² = OP² = unity. In a higher space we can again
define a direction by its direction cosines, whose squares
there too sum to unity.

If we have another line OQ (OQ = unity) with its direction
cosines OC, CD and DQ, then the cosine of the angle POQ is
equal to

OA × OC + AB × CD + BP × DQ

We call this ‘forming the inner product of the direction
cosines’. Each is multiplied by its opposite number, and the
products are summed.

The proof is almost self-evident. If OP is projected on to OQ
(see Figure 6), then since OP is unity, OR is the cosine of POQ.
Instead of projecting OP itself on to OQ, project the chain of
lines OA, AB and BP by multiplying each by the appropriate
direction cosine of OQ. Thus

OR = OA × OC + AB × CD + BP × DQ

FIG. 6

If in another notation we call the angles OP makes with the
axes α, β and γ and those which OQ makes α′, β′ and γ′, then

cos POQ = cos α cos α′ + cos β cos β′ + cos γ cos γ′

It is clear that the above proof would still apply in any number
of dimensions. The inner product of their direction cosines
is the cosine of the angle between two lines. In two dimensions
this is the familiar formula

cos (A − B) = cos A cos B + sin A sin B
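The cosine law is easily tried out numerically. In the Python sketch below the function names are hypothetical; the lines are given by any points on them, and the direction cosines are obtained by reducing each line to unit length.

```python
import math

def direction_cosines(point):
    """Direction cosines of the line from the origin through the point."""
    length = math.sqrt(sum(c * c for c in point))
    return [c / length for c in point]

def cos_angle(p, q):
    """Inner product of the direction cosines of OP and OQ."""
    return sum(a * b for a, b in zip(direction_cosines(p), direction_cosines(q)))

# Two dimensions: the law reduces to cos(A - B) = cos A cos B + sin A sin B.
A, B = 0.9, 0.4
line_a = [math.cos(A), math.sin(A)]
line_b = [math.cos(B), math.sin(B)]
print(cos_angle(line_a, line_b), math.cos(A - B))

# The same function serves unchanged in five dimensions.
print(cos_angle([1, 2, 0, 1, 3], [2, 0, 1, 1, 1]))
```

Nothing in the function depends on the number of co-ordinates, which mirrors the remark that the proof applies in any number of dimensions.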


SECTION SIX 


CENTROID-FACTORS USING UNIT 
COMMUNALITIES 


IN factorial analysis the matrix of intercorrelations of a number
of variates is often first ‘factorised’ by a process known
variously as ‘centroid analysis’ and ‘simple summation’.
The latter name reflects part of the arithmetical procedure,
the former reflects the fact that the variates are being treated
as ‘vectors’ in a space equal in dimensions to their number.

            1.0     .7     .3
             .7    1.0     .8
             .3     .8    1.0

sums        2.0    2.5    2.1   = 6.6 = 2.569²
loadings   .779   .973   .817
             l₁     l₂     l₃

If the matrix of correlations is completed by inserting unity in
each diagonal cell (‘unit’ communalities), the arithmetical
process for extracting the first ‘factor’ is illustrated above,
for three tests or variates only. The columns are summed,
giving the values 2.0, 2.5 and 2.1. These sum to 6.6, whose
square root is 2.569. The column sums, divided by this
last quantity, give what are called the loadings, or the
saturations, of each test with this first factor. They are the
correlation coefficients of each variate with this factor, itself a
hypothetical variate.

In geometrical language each of the intercorrelations is the
cosine of an angle between the lines representing the two
variates or tests, and the loadings are the cosines of the angles
each test-line makes with their resultant or centroid, hence
the name centroid method. Our first purpose is to show that
the above arithmetical process does correspond with this
geometrical picture. We shall deal first with the case of three
tests. Let their intercorrelations be

FIG. 7


1        cos α    cos β
cos α    1        cos γ      (7)
cos β    cos γ    1


and let these three tests be represented in Figure 7 by the three
lines U′OU, V′OV and W′OW with the angles α, β and γ
between the pairs. (Here, as in most later diagrams, the
figure is drawn on the positive halves of the test-lines, and
acute angles are used between the latter, corresponding to
positive correlation coefficients. But these restrictions are
only for convenience in drawing the figures.)

The resultant of three equal forces along these lines lies
along the diagonal OP of the parallelopiped shown in the
figure, with sides each equal to unity. Let the angle with OP
made by each test-line be φ, θ, ω, so that the correlations of the
tests with this resultant or centroid are cos φ, cos θ and cos ω.

Now project OP on to OU, first directly, and secondly by
projecting the chain of lines OA, AD, DP. This gives
the equation

OP cos φ = OA + AD cos α + DP cos β
         = 1 + cos α + cos β
and similarly                                  (8)
OP cos θ = cos α + 1 + cos γ
OP cos ω = cos β + cos γ + 1


From these we see that

OP (cos φ + cos θ + cos ω) = Sum of all the items in the matrix

But cos φ + cos θ + cos ω = OP, for this expression is the
projection on to OP of the chain of lines OA, AD, DP. We have
therefore that

OP² = Sum of all the items in the matrix
      (= 6.6 in our arithmetical example)

and from the first equation of (8),

cos φ = (1 + cos α + cos β)/OP
      = (Sum of the first column of the matrix)/(Square root of the total sum)

which is exactly the procedure of simple summation. The
quantity cos φ is the correlation coefficient of test 1 with the
centroid, the loading or saturation. Our diagram is in three
dimensions only, but clearly the proof is general in any
number of dimensions.
Let us now return to the arithmetical example. The loadings
of each test with the first centroid factor having been
obtained (let us call them l₁, l₂, l₃), each entry in the original
matrix of correlations is reduced by the part explained by that
first factor. Thus r₂₃ is reduced to r₂₃ − l₂l₃ (in our example,
.8 is reduced to .8 − .973 × .817 = .005). The units in the
diagonal cells are reduced by the square of the corresponding
loading; the first one becomes, e.g., 1.000 − .779² = .394.
Thus we arrive at the matrix of ‘residues’:

1 − l₁²      r₁₂ − l₁l₂   r₁₃ − l₁l₃         .394   −.058   −.336
r₁₂ − l₁l₂   1 − l₂²      r₂₃ − l₂l₃    =   −.058    .053    .005
r₁₃ − l₁l₃   r₂₃ − l₂l₃   1 − l₃²           −.336    .005    .332

It will be seen that each column sums to zero.
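The arithmetic of simple summation, and the vanishing of the column sums of the residues, can be reproduced with a short Python sketch. The function names are hypothetical; the matrix is the three-test example of this section, with r₁₂ = .7, r₁₃ = .3 and r₂₃ = .8 and unit communalities.

```python
import math

def first_centroid_loadings(matrix):
    """Column sums divided by the square root of the sum of the whole matrix."""
    n = len(matrix)
    column_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    root_of_total = math.sqrt(sum(column_sums))
    return [s / root_of_total for s in column_sums]

def residues(matrix, loadings):
    """Each entry reduced by the part explained by the first factor."""
    n = len(matrix)
    return [[matrix[i][j] - loadings[i] * loadings[j] for j in range(n)]
            for i in range(n)]

correlations = [[1.0, 0.7, 0.3],
                [0.7, 1.0, 0.8],
                [0.3, 0.8, 1.0]]

loadings = first_centroid_loadings(correlations)
residual = residues(correlations, loadings)
column_sums = [sum(row[j] for row in residual) for j in range(3)]
print([round(l, 3) for l in loadings])
print([[round(v, 3) for v in row] for row in residual])
print(column_sums)
```

The printed loadings agree with those of the worked example to within rounding in the third decimal place, and the column sums of the residues are zero apart from floating-point error.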


FIG. 8


We must now see what, in the geometrical model,
corresponds to this arithmetical procedure, and to the fact that these
columns add up to zero. Let us deal with the case of three
tests first, to enable ordinary diagrams to be made (Figure 8).


between each and OA, and since the lengths OL, etc. are unity, 
these cosines are the distances OA for the first test, OB for 


| 
the second, and a similar projection (not shown) of O
OBA. OA = l₁ and OB = l₂
Removing the influence of the centroid ‘factor’ means
projecting the test-lines as shown on to a space of one less
dimension, here in this simple case on to a plane through O
at right angles to the centroid, and dealing only with their
components in that plane. These components have weights
equal to the lengths OC, OD and OE, and with these weights
are in equilibrium, for that is the nature of a centroid, that
there is no sideways pull. The angles between the lines on the
plane can be found by the formula for partial correlation
(equation (6), page 17). Thus the cosine of DOC is

cos DOC = (cos KOL − cos KOA.cos LOB) / √{(1 − cos² KOA)(1 − cos² LOB)}
        = (r₁₂ − l₁l₂)/(k₁k₂)

if we write k² = 1 − l².
Similar formulae are obtainable for cos DOE and cos EOC.
Now since the three ‘forces’ (as we may call them to recall the
laws of combination of forces) OC, OD and OE are in
equilibrium, the projections of OD and OE on to the line OC must
just balance OC. That is, the sum of

OC, or k₁,
OD cos DOC, or k₂ (r₁₂ − l₁l₂)/(k₁k₂),
and OE cos EOC, or k₃ (r₁₃ − l₁l₃)/(k₁k₃),

must be zero. If we multiply each of these quantities by k₁
their sum will still be zero: that is,

k₁²
r₁₂ − l₁l₂
r₁₃ − l₁l₃

sum to zero
But this is exactly the first column of the matrix of residues
(page 28).

This argument also holds in a space of many dimensions,
say n dimensions containing n test-lines. If unities are used
in the diagonal cells of the n × n matrix of correlation co-


direction defines one of n rectangular co-ordinates. At right
angles to the centroid, through the origin, is an (n − 1)-space
in which the n remainders or components of the test-lines exist.

Any two of the original test-lines, say tests 1 and 2, and the
centroid, define a three-space, and the two ‘remainders’ are
in a plane at right angles to the centroid, a plane which forms
part of the (n − 1)-space. To this three-space the previous


argument applies unchanged, and the cosines of the angles 
between all the ‘remainders’ 


remainders are not now in a two-dimensional plane. They 


are, however, in equilibrium (a centroid being what it is), and 
when projected on to one of thei 


and multiplied by kı, they give 
of the table of residues, which therefore 


dimensions until y ort 
oblique test-lines, 


SECTION SEVEN 


CENTROID-FACTORS USING MINIMUM 
COMMUNALITIES 


THE preceding section dealt with the case where unit entries
are made in the diagonal cells of the matrix of correlation
coefficients between the tests. The whole process is then
conducted in the test-space, i.e. the n-space defined by the
n test-lines. It is, however, the common practice to insert
fractions called communalities in those diagonal cells, with
the hope of thereby abbreviating the process of extracting
centroids, since the diagonal cells are exhausted sooner, and
by a suitable choice of the inserted communalities it can be
arranged that the other cells also vanish then: or at least that
they become small enough to be disregarded. We shall
assume, in our geometrical considerations, that they actually
vanish. It does not concern us here to ask how these
communalities which have this result are discovered: various
ways of first guessing them exist, and the guesses are afterwards
refined and improved. The communalities are made as small
as possible without causing imaginary quantities containing
√(−1) to arise later in the arithmetical work.

Now it is obvious that in general a space defined by n
test-lines cannot be defined by a smaller number, say c, of
orthogonal centroids, and indeed it will appear that the c-space of
the centroids (called ‘common’ factors) is not in the n-space
of the tests at all.

Further, these c common factors do not account for the
whole of each test but only for the communality. The balance
to make up unity in each diagonal cell is still unaccounted for,
and n so-called ‘specific’ axes are added to the c centroid
axes to do this. Each of these specific axes is at right angles
to all the test-lines except one. Since the communalities were
made as small as possible, the specifics were thereby made as
large as possible.

We then have a space of n + c dimensions divisible into a
common-factor-space of c dimensions and a specific factor
space of n dimensions. The test-space, also of n dimensions,


1     r₁₂
r₁₂   1

If, however, we insert r₁₂ into each diagonal cell in place of
unity, the arithmetical process described on page 25 gives as
loadings with the first centroid factor the values
l₁ = l₂ = √r₁₂, thus:

            r₁₂     r₁₂
            r₁₂     r₁₂

sums        2r₁₂    2r₁₂    = 4r₁₂ = (2√r₁₂)²
loadings    √r₁₂    √r₁₂

The residues left after removing the influence of the first
factor are

r₁₂ − (√r₁₂)²    r₁₂ − (√r₁₂)²
r₁₂ − (√r₁₂)²    r₁₂ − (√r₁₂)²

that is, zero: and the process stops after one common factor
is extracted.


In Figure 9 the two tests are represented by the test-lines
and OV, with an angle UOV whose cosine is r₁₂. The

angle whose cosine is r₁₂, the line OG does not lie in the
plane UOV.

(In passing, the reader will no doubt remark that OG, not
being in the plane of OU and OV, is not their centroid. Of
that, more below.)

The common-factor-space here is of one dimension only,
and is clearly not in the test-space, which is the plane UOV.
Two more axes must be added to OG to enable points in the
test-space to be described (by three co-ordinates), and these,
if they have to be specific and not common factors, have to be
OS₁ perpendicular to the plane of GOV, and OS₂ perpendicular
to the plane of GOU. The lines OG, OU and OS₁ are all in one
plane, and the lines OG, OV and OS₂ all in another,
perpendicular to it. (The points ABC and the cross-hatching are
merely added to make the drawing look solid.)

In the more general case of n tests, when communalities are
used, each test-line can be looked upon as having c + 1
co-ordinates; c in the common-factor-space and 1 along the
specific axis belonging to that test. With the c communal
co-ordinates it makes angles whose cosines are its loadings
with the successive factors obtained by the simple summation
process, which is carried out in exactly the same way here as on
page 25, except that communalities have replaced the units
in the diagonal cells. If, following custom, we call the
communalities in a ‘battery’ of three tests h₁², h₂² and h₃², the
process, for the first factor, is

h₁²    r₁₂    r₁₃
r₁₂    h₂²    r₂₃
r₁₃    r₂₃    h₃²

h₁² + r₁₂ + r₁₃    h₂² + r₁₂ + r₂₃    h₃² + r₁₃ + r₂₃

and the loadings are the column sums divided by

√(h₁² + h₂² + h₃² + 2r₁₂ + 2r₁₃ + 2r₂₃)


Now this first factor, though not the centroid of the test-lines,
is the centroid of their communal components in the
common-factor-space, as we must shortly show: but first there are two
preliminary points to clear up: the geometrical meaning of
the square root of a communality, and the size of the angle
between the communal components of the test-lines.

We shall show that hᵢ is the projection on the c.f.s. (a
convenient abbreviation) of unit distance along the test-line of
test i; and that, if φ be the angle between test-lines i and j,
the angle θ between hᵢ and hⱼ has as cosine the value

cos θ = cos φ/(hᵢhⱼ) = rᵢⱼ/(hᵢhⱼ)   (9)
созбе hh; hih; 


In the more general case, with numerous tests, it is unlikely
that communalities can be found to reduce the number of
centroids to one. The common-factor-space will have, say,
c dimensions. A test-line OU is not in this c-space, but sticks
out into a further dimension. We can describe OU by means
of c + 1 axes, namely the c factor axes in the c.f.s., and its
specific axis; and its direction



there are two common factors. These two common 
factors, and the Specific of test i, define a three-space, shown in 


ү 
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lane XOY. The co-ordinates of 


of OU, at right angles to the p 
d CA, the loadings of test 7 


A on these three axes are OB, BC an 
on the two common factors and its specific, or, in other words, 
the direction cosines of OU. The distance OA projected on 
to the plane XOY gives the distance OD; whose co-ordinates 
on that plane are OB and BD; equal to two of the co-ordinates 
of A, the two common factor loadings of the test OU. If we 
call the three loadings /, m and s (see the figure), then 
\[ l^2 + m^2 + s^2 = 1 \]
and
\[ OD_i^2 = l^2 + m^2 = h_i^2 \]

for the arithmetical procedure, which consisted in subtracting from the communality first $l^2$ and then $m^2$, by hypothesis has exhausted the communality and reduced it to zero. So $OD_i = h_i$. The quantities $h$ for each test are the projections on to the common-factor-space of unit distance along each test-line. Clearly this, shown here only for the case of two common factors, is general for $c$ common factors. In the $c+1$ space the loadings of OA still, when squared, sum to unity: and its projection on to the common-factor-space is the square root of the sum of squares of the common factor loadings.

Next consider the size of the angle $\theta$ between two lines like OD, the projections of two test-lines OU and OV. Two such are shown in Figure 10, $OD_i$ and $OD_j$, and the specific axis $OS_j$ of the new test is indicated by a dotted line. OV, the line of the new test itself, is not shown. It is in a different three-space, $S_jOXY$. The whole diagram […]

FIG. 11

\[
\begin{array}{lcccc}
\text{for } OU & h_i\cos\theta & h_i\sin\theta & \sqrt{1-h_i^2} & 0 \\
\text{for } OV & h_j & 0 & 0 & \sqrt{1-h_j^2}
\end{array}
\]

The inner product (see page 23) of these four direction cosines gives us the value of $\cos\phi$ (between the test-lines),
\[ \cos\phi = h_i h_j \cos\theta \]
whence
\[ \cos\theta = \frac{\cos\phi}{h_i h_j} = \frac{r_{ij}}{h_i h_j} \]


[…] of the tests. Take first […] (Figure 12), and, for convenience in drawing the diagram, three tests; but the argument which follows would apply to any number.

Take as three co-ordinate axes $OS_1$ (the specific of test 1), OL along the line of $h_1$ in the c.f.s., and a line OM in the c.f.s. at right angles to $h_1$. The figure has been decorated with a door, a window, and a picture to make it look solid (Figure 12).

Since the test-line is in the same plane as $h_1$ […] its own […] lie on the floor of the room […] distance along it. $OP = h_1$ […] parallel to $h_2$, and then from Q draw QR […] parallel to $h_3$. OR is then the centroid of the communal components $h_1$, $h_2$ and $h_3$. (If there had been more tests we […] chain […] and R would be at the end of the chain.) We have […] AOR, between the test-line […] as the loading of test 1 on […] centroid of the $h$’s, is […] is a right angle (see Section Three, page 19), and OC is therefore the cosine we seek. We […]

\[
\begin{aligned}
\cos AOR = OC &= h_1 \cos POC = h_1\,\frac{OB}{OR} \\
&= \frac{h_1^2 + h_1 h_2\cos\alpha + h_1 h_3\cos\beta}{OR} \\
&= \frac{h_1^2 + r_{12} + r_{13}}{OR}
\end{aligned}
\]

FIG. 12
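The numerator is the sum of the first column in the summation process applied to

\[
\begin{array}{ccc}
h_1^2 & r_{12} & r_{13} \\
r_{12} & h_2^2 & r_{23} \\
r_{13} & r_{23} & h_3^2
\end{array}
\]

It remains to show that $OR^2$ is the sum of all the items in this matrix, thus:

\[
\begin{aligned}
OR^2 &= OB^2 + BR^2 \\
&= (h_1 + h_2\cos\alpha + h_3\cos\beta)^2 + (h_2\sin\alpha + h_3\sin\beta)^2 \\
&= h_1^2 + h_2^2 + h_3^2 + 2h_1h_2\cos\alpha + 2h_1h_3\cos\beta + 2h_2h_3\cos(\beta-\alpha) \\
&= h_1^2 + h_2^2 + h_3^2 + 2r_{12} + 2r_{13} + 2r_{23}
\end{aligned}
\]

the last step following from equation (9), since $\alpha$, $\beta$ and $\beta-\alpha$ are the angles between $h_1$ and $h_2$, $h_1$ and $h_3$, and $h_2$ and $h_3$ respectively.

[…] centroid of the communality components, and the same for each other test.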


FIG. 13

The above proof is for a common-factor-space of two dimensions: but the same form of proof serves for any number of c.f.s. dimensions. Figure 13 is for three common factors. The cosine required is still the projection […] c.f.s. is of three […] (if more tests) […] and $OR^2$ will still be found by the reader (to whom detailed working is left) to equal the sum of all the items in the correlation matrix with communalities in the diagonal.
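
The summation process of this section can be checked numerically. What follows is a minimal Python sketch, not the author's own working: the function name `first_centroid_loadings` and the three loadings 0.9, 0.8, 0.7 are assumptions chosen purely for illustration. It builds a correlation matrix with communalities in the diagonal for a battery with a single common factor, divides the column sums by the square root of the sum of all the items (the length OR), and evaluates equation (9) for tests 1 and 2.

```python
import math

def first_centroid_loadings(reduced):
    """First-factor loadings by simple summation.

    `reduced` is a square correlation matrix (list of lists) whose
    diagonal cells hold communalities instead of units.
    """
    n = len(reduced)
    column_sums = [sum(reduced[i][j] for i in range(n)) for j in range(n)]
    total = sum(column_sums)        # sum of all the items, i.e. OR squared
    return [s / math.sqrt(total) for s in column_sums]

# Assumed battery: one common factor with these (hypothetical) loadings.
assumed = [0.9, 0.8, 0.7]
# r_ij = l_i * l_j off the diagonal, communality h_i^2 = l_i^2 on it.
reduced = [[a * b for b in assumed] for a in assumed]

loadings = first_centroid_loadings(reduced)

# Equation (9): cosine of the angle between h_1 and h_2.
h1 = math.sqrt(reduced[0][0])
h2 = math.sqrt(reduced[1][1])
cos_theta = reduced[0][1] / (h1 * h2)

print([round(x, 6) for x in loadings])
print(round(cos_theta, 6))
```

With a single common factor the process returns the assumed loadings, and the cosine between the communal components is unity: all the $h$’s lie along one line, so their centroid is that line itself.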


SECTION EIGHT 


MULTIPLE CORRELATION AND PARTIAL 
REGRESSION COEFFICIENTS 


WHEN we know the correlation coefficients between a number of variates $x_0, x_1, x_2, \ldots, x_n$ taken in all pairs, it is possible to find weights for the variates $x_1$ to $x_n$ so that the correlation between their weighted sum and the other variate $x_0$ is a maximum. This maximum correlation is commonly called the multiple correlation, and the weights, partial regression coefficients. (We shall assume the variates to be standardised. If they are not, the coefficients we speak of will require each to be divided by the standard deviation of its variate.)

When the variates are the scores in mental tests, $x_0$ is usually something the experimenter wants to predict, such as secondary school success predicted from a battery of tests given before entrance to the school. It is often referred to by psychologists as the ‘criterion’, or sometimes as the ‘predicand’.

Figure 14 portrays the case of two tests $x_1$ and $x_2$ and one predicand $x_0$. The lines for $x_1$ and $x_2$ are drawn on a table-top to assist in visualising the fact that the line for $x_0$ is not in their plane. A weighted combination of $x_1$ and $x_2$ is to be found which will be as near as possible to $x_0$, for the smaller the angle with $x_0$ the larger the cosine and therefore the larger the correlation. Now no combination of $x_1$ and $x_2$ can be elsewhere than on the table-top: and clearly the line on the table-top which is nearest to $x_0$ or OD is the line OK, the projection of OD on to the table. If the distance OD is unity, we shall show that the proper weights to give to $x_1$ and $x_2$ are the distances OA and OB, obtained by drawing the parallelogram KAOB. The parallelogram of forces shows us this. It is necessary then to show that these distances (OD being unity) are the same fractions as are given by the formulae for partial regression coefficients found by the method of least squares. For two tests these are

\[ b_1 = \frac{r_{01} - r_{02}\,r_{12}}{1 - r_{12}^2}, \qquad b_2 = \frac{r_{02} - r_{01}\,r_{12}}{1 - r_{12}^2} \]


Draw perpendiculars KE and KF to the two test-lines, and from E drop a perpendicular EG to OF. Then (by Section Three, page 19) the angles DEO and DFO are right angles, and

\[ OE = \cos\alpha = r_{01}, \qquad OF = \cos\beta = r_{02} \]

for the correlation coefficients are represented by the cosines

\[
\begin{array}{c|ccc}
 & x_0 & x_1 & x_2 \\ \hline
x_0 & 1 & \cos\alpha & \cos\beta \\
x_1 & \cos\alpha & 1 & \cos\phi \\
x_2 & \cos\beta & \cos\phi & 1
\end{array}
\]

Further,
\[ OG = OE\cos\phi = \cos\alpha\cos\phi \]
so that
\[ GF = OF - OG = \cos\beta - \cos\alpha\cos\phi \]

But GF is the projection of KE and equals $KE\sin\phi$, and KE is the projection of OB and equals $OB\sin\phi$, so that

\[ OB = \frac{KE}{\sin\phi} = \frac{GF}{\sin^2\phi} = \frac{\cos\beta - \cos\alpha\cos\phi}{1 - \cos^2\phi} = \frac{r_{02} - r_{01}\,r_{12}}{1 - r_{12}^2} \]


Similarly OA is the other partial regression coefficient. Thus the weights to be given to scores in two standardised variates (e.g. two tests) so that their weighted sum will correlate as highly as possible with a third variate, are equal to the oblique co-ordinates, on the two test-lines, of a point projected from unit distance along that third variate on to their plane.

This statement is also true for a battery of more tests, and a criterion.
In our simpler diagram, Figure 14, the correlation between the variate $x_0$ and the weighted sum of $x_1$ and $x_2$ is the cosine of DOK, which equals OK since OD is unity. That is, the multiple correlation is measured by OK. If we project OK on to $x_0$ at OL, then OL measures the square of the multiple correlation. Now OL is also the projection of the chain OA and AK on to $x_0$, that is, it is equal to
\[ OA\cos\alpha + AK\cos\beta \]
or
\[ b_1 r_{01} + b_2 r_{02} \]
if we call the partial regression coefficients $b_1$ and $b_2$. That is,
\[ r_{\max}^2 = b_1 r_{01} + b_2 r_{02} \]
where $r_{\max}$ means the multiple correlation. It is not difficult to see that such an equation will hold for more variates in the battery. For if K is the projection of D on to a battery space, and OK is projected on to the criterion line as OL, OL will still measure the square of the cosine of the angle DOK. Also OL will still equal the sum of the projections of the oblique co-ordinates of K on to the criterion line, i.e.

\[ r_{\max}^2 = b_1 r_{01} + b_2 r_{02} + b_3 r_{03} + \cdots \qquad (11) \]
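
The two-test case can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the helper name `two_test_weights` and the correlations 0.6, 0.5 and 0.4 are assumptions, not values from the text. It computes the weights from the expression derived above for OB (and its counterpart for OA), evaluates the two-test form of equation (11), and compares the result with the correlation between $x_0$ and the weighted sum computed directly.

```python
import math

def two_test_weights(r01, r02, r12):
    """Partial regression weights b1, b2 for two standardised tests."""
    denom = 1.0 - r12 ** 2
    b1 = (r01 - r02 * r12) / denom
    b2 = (r02 - r01 * r12) / denom
    return b1, b2

# Illustrative correlations (assumed, not taken from the text).
r01, r02, r12 = 0.6, 0.5, 0.4
b1, b2 = two_test_weights(r01, r02, r12)

# Square of the multiple correlation, as in equation (11) for two tests.
r2_max = b1 * r01 + b2 * r02

# Direct check: correlation of x0 with the weighted sum b1*x1 + b2*x2.
covariance = b1 * r01 + b2 * r02
variance = b1 ** 2 + b2 ** 2 + 2 * b1 * b2 * r12
r_direct = covariance / math.sqrt(variance)

print(round(b1, 4), round(b2, 4))
print(round(r2_max, 4), round(r_direct ** 2, 4))
```

The two printed squares agree, which is the algebraic counterpart of OL being the projection of OK back on to the criterion line.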


* * * * * * * 


The remainder of this section is for readers with more mathematical knowledge, namely a knowledge of the method of least squares and the elements of matrix algebra. Other readers may nevertheless wish to read the geometrical discussion of Figure 16 (pages 45-47).

FIG. 15

Figure 15 is intended to represent $n+1$ dimensions, for a battery of $n$ tests and a criterion, unit distance along which is represented by OD. When OD is projected on to the $n$-space of the tests we get OK, and OK is projected back on to the criterion line giving OL. The point K has oblique co-ordinates along the test-lines measuring $b_1, b_2, \ldots, b_n$ in length. Beginning at O, a chain of lines equal to these $b$’s in length, and parallel to them in direction, is built up. The chain will end at K.

Let the correlation coefficients of the tests and criterion with each other be

\[
R = \begin{pmatrix}
1 & r_{01} & r_{02} & \cdots & r_{0n} \\
r_{01} & 1 & r_{12} & \cdots & r_{1n} \\
r_{02} & r_{12} & 1 & \cdots & r_{2n} \\
\vdots & \vdots & \vdots & & \vdots \\
r_{0n} & r_{1n} & r_{2n} & \cdots & 1
\end{pmatrix}
\]

and let $|R| = \Delta$. Let these correlations be represented by

\[
\begin{pmatrix}
1 & \cos\alpha & \cos\beta & \cos\gamma & \cdots \\
\cos\alpha & 1 & \cos\phi & \cos\theta & \cdots \\
\cos\beta & \cos\phi & 1 & \cos\psi & \cdots \\
\cos\gamma & \cos\theta & \cos\psi & 1 & \cdots
\end{pmatrix}
\]

Further, let $a$ be the leading entry in the reciprocal of the matrix $R$, i.e.

\[ a = \frac{\Delta_{00}}{\Delta} \qquad (12) \]

where $\Delta_{00}$ is the minor of the first item in $R$. Then it can be shown that the distances OL, LD, LK and KD in Figure 15 have the values there shown in terms of $a$.

We can now show that geometrical considerations in the figure lead to values of $b_1, b_2, b_3$, etc. identical with those found by the method of least squares algebraically. We shall write the work out for $n = 3$ only, but it is clearly quite general.

The projection of OK on to the line of test 1 is the same as the sum of the projections of its chain of $b$’s. It is also, by the principle of Section Three (page 19), equal to the direct projection of OD on to that line. That is,

\[
\begin{aligned}
\cos\alpha &= b_1 + b_2\cos\phi + b_3\cos\theta \\
\text{Similarly}\quad \cos\beta &= b_1\cos\phi + b_2 + b_3\cos\psi \\
\text{and}\quad \cos\gamma &= b_1\cos\theta + b_2\cos\psi + b_3
\end{aligned}
\]

In matrix form,

\[
\begin{pmatrix} \cos\alpha \\ \cos\beta \\ \cos\gamma \end{pmatrix}
=
\begin{pmatrix}
1 & \cos\phi & \cos\theta \\
\cos\phi & 1 & \cos\psi \\
\cos\theta & \cos\psi & 1
\end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
\]

whence

\[
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix}
1 & \cos\phi & \cos\theta \\
\cos\phi & 1 & \cos\psi \\
\cos\theta & \cos\psi & 1
\end{pmatrix}^{-1}
\begin{pmatrix} \cos\alpha \\ \cos\beta \\ \cos\gamma \end{pmatrix}
\qquad (13)
\]


But this is exactly the same solution for $b_1$, $b_2$ and $b_3$ as is found algebraically by the method of least squares. We wish by that method to find values for $b_1$, etc. which will, as […] persons, give values identical with their scores $w$ in the criterion. This is impossible with only three (or $n < N$) constants at disposal in $N$ equations. The best that can be done is to ensure that the difference between the criterion score and the weighted battery score is small in each case, say $v$. We then have $N$ equations

\[
\begin{aligned}
w_1 - (b_1 x_1 + b_2 y_1 + b_3 z_1) &= v_1 \quad \text{for the first person} \\
w_2 - (b_1 x_2 + b_2 y_2 + b_3 z_2) &= v_2 \quad \text{for the second person}
\end{aligned}
\qquad (14)
\]

and so on for $N$ persons. By the principle of least squares we minimise $\sum v^2$. This leads to the three ‘normal equations’

\[
\begin{aligned}
\textstyle\sum wx - b_1 \sum x^2 - b_2 \sum xy - b_3 \sum xz &= 0 \\
\textstyle\sum wy - b_1 \sum xy - b_2 \sum y^2 - b_3 \sum yz &= 0 \\
\textstyle\sum wz - b_1 \sum xz - b_2 \sum yz - b_3 \sum z^2 &= 0
\end{aligned}
\]

and these equations are equivalent to

\[
\begin{aligned}
r_{01} - b_1 - b_2 r_{12} - b_3 r_{13} &= 0 \\
r_{02} - b_1 r_{12} - b_2 - b_3 r_{23} &= 0 \\
r_{03} - b_1 r_{13} - b_2 r_{23} - b_3 &= 0
\end{aligned}
\]

that is, to

\[
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}
=
\begin{pmatrix}
1 & r_{12} & r_{13} \\
r_{12} & 1 & r_{23} \\
r_{13} & r_{23} & 1
\end{pmatrix}^{-1}
\begin{pmatrix} r_{01} \\ r_{02} \\ r_{03} \end{pmatrix}
\]

exactly the same equations as those arrived at by the geometrical argument, if we write cosines for correlation coefficients. We have only written this out for a battery of three tests, but it is clearly general for $n$ tests or variates.

FIG. 16 (test-lines labelled Test 1, Test 2, Test 3)
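
For a battery of any size the weights are found by solving the linear system just written down. The sketch below is a minimal pure-Python illustration; the function name `solve` and the correlations used are assumptions for the example, not figures from the text. It may help to note that, for standardised scores (standard deviations taken with divisor $N$), dividing each normal equation by $N$ turns $\sum x^2$ into 1, $\sum xy$ into $r_{12}$ and $\sum wx$ into $r_{01}$, which is why the two sets of equations coincide.

```python
def solve(a, y):
    """Solve a.b = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [x - f * p for x, p in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

# Illustrative correlations (assumed): intercorrelations of three tests,
# and the correlation of each test with the criterion.
tests = [[1.0, 0.4, 0.3],
         [0.4, 1.0, 0.2],
         [0.3, 0.2, 1.0]]
criterion = [0.6, 0.5, 0.4]

b = solve(tests, criterion)                            # b1, b2, b3
r2_max = sum(bi * ri for bi, ri in zip(b, criterion))  # equation (11)

print([round(x, 4) for x in b])
print(round(r2_max, 4))
```

Elimination is used here rather than forming the reciprocal matrix explicitly; either route gives the same $b$’s.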
The model also enables us to see just what happens when a variate is removed from a battery, resulting in a decrease in the multiple correlation of the remaining battery with the criterion and a change in the partial regression coefficients. Figure 16 is for three tests, and the criterion line, not shown, is in a fourth dimension. As in other diagrams, OK is the projection on the test-space of unit distance OD along the criterion, and equals the multiple correlation. Its oblique co-ordinates along the test-lines are the partial regression coefficients $b_1$, $b_2$, $b_3$. What happens to these if test 3 is removed from the battery, leaving only tests 1 and 2? In that case, OD on the criterion line would have to be projected on to the plane of these two tests at Q, and the new multiple correlation would be measured by OQ. But, by the principle of Section Three (page 19), Q is also the projection of K on to the plane. So the reduction in the square of the multiple correlation is

\[ OK^2 - OQ^2 = KQ^2 \]

and the new partial regression coefficients are OP and PQ. The increases in the coefficients $b_1$ and $b_2$ are therefore ST and TQ = SR. Now the figure KSTQR is clearly a miniature of a figure like our Figure 14, for finding the weights to give to tests 1 and 2 in estimating test 3; and because $SK = b_3$ instead of unity, the lengths ST and SR are $b_3$ times these weights. That is, when a test $i$ is omitted from a battery, the partial regression coefficients of the surviving tests have to be increased by $b_i$ times their coefficients in estimating the omitted test, a statement true for a battery of many tests, not merely of three, as imagining a Figure 16 in many dimensions will show.

If the intercorrelations of the tests of the battery are $r$’s, and the elements of the reciprocal matrix are $c$’s, then these increases are, for the abolition of test $i$,

\[ -\frac{c_{ij}}{c_{ii}}\, b_i \]

where $b_i$ is the partial regression coefficient of test $i$ in the battery before its omission. That is, the coefficient of each test $j$ is increased from $b_j$ to

\[ b_j - \frac{c_{ij}}{c_{ii}}\, b_i \]

by the omission of test $i$. (The quantity $c_{ij}$ is commonly, though not necessarily, negative.)
The reduction in $r_{\max}^2$ is in Figure 16 $KQ^2$, and

\[
\begin{aligned}
KQ^2 &= KS^2 - SQ^2 \\
&= b_3^2 - (ST^2 + TQ^2 + 2\,ST\cdot TQ\cos\phi)
\end{aligned}
\]

and a little algebraic manipulation, using the formulae in $b$’s and $c$’s for ST and TQ, and remembering that $\cos\phi = r_{12}$, reduces this to

\[ \text{decrease in } r_{\max}^2 = \frac{b_3^2}{c_{33}} \]

or, in general,

\[ \frac{b_i^2}{c_{ii}} \]
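
Both results on the omission of a test can be verified numerically. The following Python sketch is an illustration under assumed correlations (the same hypothetical values as might be used for any three-test battery); the helper names `solve` and `inverse` are not from the text. It applies the two formulae above to the omission of test 3 and compares them with a fresh solution for the two-test battery.

```python
def solve(a, y):
    """Solve a.b = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(n):
            if k != col:
                f = m[k][col] / m[col][col]
                m[k] = [x - f * p for x, p in zip(m[k], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def inverse(a):
    """Reciprocal matrix, built column by column with solve()."""
    n = len(a)
    cols = [solve(a, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    return [[cols[j][i] for j in range(n)] for i in range(n)]

# Illustrative correlations (assumed, not taken from the text).
tests = [[1.0, 0.4, 0.3],
         [0.4, 1.0, 0.2],
         [0.3, 0.2, 1.0]]
criterion = [0.6, 0.5, 0.4]

b = solve(tests, criterion)          # weights in the full battery
c = inverse(tests)                   # the c's of the reciprocal matrix
r2_full = sum(x * y for x, y in zip(b, criterion))

i = 2                                # omit test 3 (index 2)
keep = [j for j in range(len(tests)) if j != i]
adjusted = [b[j] - c[i][j] / c[i][i] * b[i] for j in keep]
decrease = b[i] ** 2 / c[i][i]

# Direct recomputation with the smaller battery, for comparison.
small = [[tests[p][q] for q in keep] for p in keep]
small_crit = [criterion[p] for p in keep]
direct = solve(small, small_crit)
r2_small = sum(x * y for x, y in zip(direct, small_crit))

print([round(x, 4) for x in adjusted], [round(x, 4) for x in direct])
print(round(decrease, 4), round(r2_full - r2_small, 4))
```

The adjusted weights agree with those found by solving the reduced battery afresh, and the fall in the squared multiple correlation agrees with $b_i^2 / c_{ii}$.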


SECTION NINE 
PREDICTION OF A MEASURABLE VARIATE 


LET us suppose, in the first place, that the scores of a very large 
number N of persons are known in n+1 variates (or tests),
that the correlation coefficients have all been calculated; but
that the list of scores in the first test $x_0$ has been lost. Can we
reconstitute that list from a knowledge of each person’s n 
scores in the other tests, and of the correlation coefficients? 

The answer is that we cannot say what each of the N persons 
scored in the missing list. But for each group of them all with 
the same set of scores in the n surviving tests, we can say what
the average score of the group was in the missing test list. If
we now give to each member of this group that average score
of his group in $x_0$, then this will in half the cases be too big and
in half too little, but it is the best we can do. We can also, by 
adding a ‘plus or minus’ quantity, indicate how the actual 
scores in the recovered list will be found to be scattered about 
this ‘prediction’. 

In this situation of a lost list ‘prediction’ is hardly the right
word: but in practice actual prediction is usually required and
is effected, though with still less certainty, by the same
procedure. Imagine that all the correlation coefficients of the
n+1 tests have been calculated from scores made by one set of
persons, and that another set of persons is tested in the n tests
but not in $x_0$. We ask what they will score in $x_0$ when (perhaps
much later) they come to take that test. Each of them can
certainly be allotted a predicted or estimated score in $x_0$, but
it will only be the average of what he and others just like him
in the n tests will later score, it will not be exactly correct for
him individually. And here, where two different sets or
samples of people are concerned, there will be an additional
error unless the samples are very similar.
We shall consider, geometrically, the case where n = 2, and
point out that the proof is general for a battery of n tests. In
Figure 17 test-lines are shown for two tests, OU and OV, and a
‘criterion’ line OD for $x_0$ which has to be ‘predicted’ from
scores in the two tests. The space is three dimensional, and
the points representing persons are spherical in contour round
the point O where the three test-lines cross. P represents a
person who gets scores OX and OY in the two tests; but note
that every person whose point in the three-space is vertically
above or vertically below P also gets these same two scores.
From the two scores we cannot tell where a person’s actual
point is on that vertical line. P is at the mid-point of their
distribution along it, and to find the average score the group
will make in $x_0$ we project P on to its line at Q, and then OQ
is the score in question.

The algebraic formula for the ‘best’ prediction of a score in $x_0$ from scores in $x_1$ and $x_2$ (best in the least squares sense) is

\[ \hat{x}_0 = b_1 x_1 + b_2 x_2 \]

(compare equation (14), page 44), where the circumflex accent over $x_0$ indicates that it is not a measured but only a predicted value. If our model agrees with this, then we ought to be able to show that (when OD is unity)

\[ OQ = OA \times OX + OB \times OY \]

Call the angles UOV and UOP, $\phi$ and $\omega$. Then the right-hand side is

\[
\begin{aligned}
& OA \cdot OP\cos\omega + OB \cdot OP\cos(\phi - \omega) \\
&= OP\,\{OA\cos\omega + AK\cos(\phi - \omega)\} \\
&= OP \times \text{projection of } OK \text{ on to } OP \\
&= OP \times \text{projection of } OD \text{ on to } OP \\
&= OP \times \cos DOP \quad (\text{since } OD \text{ is unity}) \\
&= OQ
\end{aligned}
\]


Moreover, this proof is clearly general for any number of tests, 
i.e. any number of dimensions in the battery space instead of 
the two dimensions of the table-top. The point K is always 
the projection of D on to the test-space, and the oblique co- 
ordinates of K along the test-lines are the b’s, the partial 
regression coefficients. The scores (like OX) are OP times 
a cosine. The cosine is transferred to the coefficient (like 
OA cos ω), and these add up to OK projected on to OP, i.e.
to cos DOP. So the value of the whole is OP cos DOP or OQ.
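
The identity OQ = OA × OX + OB × OY can also be checked with co-ordinates. The Python sketch below is an illustration only: the co-ordinate frame (table-top as the plane of the first two axes), the angle of 60 degrees between the test-lines, and the positions chosen for D and P are all assumptions. It finds the oblique co-ordinates of K, forms the weighted sum of the two scores, and compares it with the direct projection of P on to the criterion line.

```python
import math

def dot(p, q):
    """Inner product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(p, q))

phi = math.radians(60)                     # angle UOV between the test-lines
u = (1.0, 0.0, 0.0)                        # unit distance along OU
v = (math.cos(phi), math.sin(phi), 0.0)    # unit distance along OV
d = (0.5, 0.4, math.sqrt(1 - 0.5 ** 2 - 0.4 ** 2))   # unit distance OD

# K is the projection of D on to the table-top; OA and OB are its
# oblique co-ordinates along OU and OV, so that K = OA*u + OB*v.
k = (d[0], d[1], 0.0)
ob = k[1] / math.sin(phi)
oa = k[0] - ob * math.cos(phi)

# A person-point P on the table-top (assumed position).
p = (1.2, -0.7, 0.0)
ox, oy = dot(p, u), dot(p, v)              # his two test scores OX and OY
oq = dot(p, d)                             # projection of P on the criterion
predicted = oa * ox + ob * oy

# The same weights from the correlation formula for two tests.
r01, r02, r12 = dot(d, u), dot(d, v), dot(u, v)
b1 = (r01 - r02 * r12) / (1 - r12 ** 2)
b2 = (r02 - r01 * r12) / (1 - r12 ** 2)

print(round(predicted, 6), round(oq, 6))
print(round(oa, 6), round(b1, 6), round(ob, 6), round(b2, 6))
```

The weighted sum of the scores equals OQ, and the oblique co-ordinates of K equal the partial regression coefficients given by the algebraic formula.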

We have in this section been considering the recovery of 
a lost set of test-scores in one test from a knowledge of the 
scores in a battery of other tests with known correlations with
the missing one and with each other. When the lost list is 
found, we can compare our predictions or estimations with 
it, and we find that we are only right in an average way. Half 
our estimates are too high, half too low. 

We have applied the same procedure to build up a list which
has been not ‘lost’ but not yet made. Again, when it is made,
we can compare our predictions with the facts, and find this
time that even our averages for similar subgroups may be
wrong, since the correlation coefficients must have been
calculated from a different sample of persons.

In spite of these uncertainties, the method is the best we can 
do when the practical need to predict is great, and we can 
temper assurance with ‘plus or minus’ added to each prediction.
This range of uncertainty will be the narrower, the nearer the 
predicand or criterion line is to the test-space, e.g. in Figure 17, 
the smaller the angle DOK: or perhaps we should say, the 
nearer the test-space is to the predicand line, for it is the 
former we can change, by picking suitable tests. When the 
predicand line is actually in the test-space we can predict with 
certainty. 
SECTION TEN
THE ESTIMATION OF FACTORS 


A FACTOR is a hypothetical test, and its correlation coefficients 
with the actual tests of a battery are its so-called loadings or 
saturations in each test. These, for centroid factors, are found 
by the process of simple summation: and other factors can be 
found by rotating the centroid factors to new positions. 

The situation, then, if we want to estimate a man’s factors 
from his scores in a battery of tests, is exactly the same as when 
we want to estimate his scores in a further test, except indeed
for the important difference that in the latter case we can give 
him the criterion test and check our estimate. We cannot give 
him the hypothetical factor test. It doesn't exist. 

When communalities are used, none of the factors, common 
or specific, is in the test-space. The method described in the 
preceding section can be used to estimate a man P's factors in 
exactly the same way as is there described for estimating or 
predicting his score in a real test. In Figure 17, OD could be a 
factor-line. Factor-estimations made in this way have the 
same advantages and disadvantages as estimates of a real test- 
score from correlated tests. They will give every one of a group 
of men who get the same test-scores, the same factor-estimates, 
although their factors no doubt differ round this as mean in the 
same way as with the real criterion, 

A factor can be considered as the resultant of two components,
one in the test-space and one at right angles to it. The
scores a man makes in the tests can tell us nothing whatever
about this second component, which is absolutely uncorrelated 
with any of the tests we have given him. All we can do is to 
make some assumption about his ability in that direction, and 
the assumption made when we apply to a factor the method of 
estimation described in the preceding section is that in this 
unknown ability at right angles to the test-space he is average. 
We take P (Figure 17) in the test-space as being actually his
point also in the space of higher dimensions. It is important 
to stress the fact that when there are more factors than tests 
(as there are when communalities are used) nothing whatever, 
no mathematical device whatever, can do away with the need 
for making some such assumption. And the assumption of 
average ability, when ignorance is complete, is the proper one 
to make—we shall call this ‘assumption A’ (two others will be 
spoken of later). 
Someone may protest against this, saying that if P does well
in the battery of tests he is likely to be above average in any 
new test, for abilities tend to be positively correlated. True; 
but not this particular ability, for it is at right angles to all the 
battery tests, uncorrelated with any of them. We have no 
reason at all to think that P is good or bad in this direction, in 
this component of his factor. Since more men are average 
than anything else, we had better assume P to be average. 
Method A, then, uses to assess a man’s factors exactly the
same procedure as is commonly used in assessing a real and
directly measurable quantity, the method described in the
preceding section.
Method B, proposed by Professor M. S. Bartlett, makes a different approach, by minimising the influence of the specific factors. His working out of this is done algebraically, but it can, at any rate crudely, and perhaps exactly, be said to be equivalent to assuming, not that P is average in the unknown component, but that he has such ability in it as places his point as near as possible to the common-factor-space. In the simple battery of two tests portrayed in Figure 17, this means that P is moved to P', where the vertical line most closely approaches the criterion line OD: and then Q', its projection on to OD, gives the value […]. (In the solid Figure 17, the vertical […] continuation of OD: for OD […])

When there are several common factors, and a c.f.s. of several dimensions, the problem is to find a position for P' as near as possible to that common-factor-space, while still keeping P'P at right angles to the test-space.
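
The difference between the two assumptions can be seen in a small numerical case. The Python sketch below is an illustration under stated assumptions, not Bartlett's own working: it takes two tests with one common factor G and two specifics, with hypothetical loadings 0.8 and 0.6 and hypothetical scores 1.0 and 0.5. Method A projects P itself on to the factor-line; Method B is carried out geometrically, by sliding along the perpendicular to the test-plane until the point is nearest the line OG. For comparison it also evaluates the weighted least-squares expression commonly given for Bartlett's estimate, which is an addition here and does not appear in the text.

```python
import math

l1, l2 = 0.8, 0.6                          # assumed loadings on G
s1, s2 = math.sqrt(1 - l1 ** 2), math.sqrt(1 - l2 ** 2)   # specific loadings
u = (l1, s1, 0.0)                          # test-line OU in the axes (G, S1, S2)
v = (l2, 0.0, s2)                          # test-line OV
x1, x2 = 1.0, 0.5                          # one man's standardised scores (assumed)

# P: the point of the test-plane whose projections on OU and OV are x1, x2.
r = l1 * l2                                # correlation of the two tests
alpha = (x1 - r * x2) / (1 - r ** 2)
beta = (x2 - r * x1) / (1 - r ** 2)
p = tuple(alpha * a + beta * b for a, b in zip(u, v))

# Method A: project P itself on the factor-line OG (the first axis).
estimate_a = p[0]

# Method B, read geometrically: move along the normal to the test-plane
# to the point P' nearest the line OG, then project P' on OG.
n = (u[1] * v[2] - u[2] * v[1],
     u[2] * v[0] - u[0] * v[2],
     u[0] * v[1] - u[1] * v[0])            # normal to the test-plane (u x v)
t = -(p[1] * n[1] + p[2] * n[2]) / (n[1] ** 2 + n[2] ** 2)
p_dash = tuple(a + t * b for a, b in zip(p, n))
estimate_b = p_dash[0]

# Comparison formulae (standard forms, not from the text).
b1 = (l1 - l2 * r) / (1 - r ** 2)          # regression weights for G
b2 = (l2 - l1 * r) / (1 - r ** 2)
regression = b1 * x1 + b2 * x2
psi1, psi2 = 1 - l1 ** 2, 1 - l2 ** 2      # specific variances
weighted = (l1 * x1 / psi1 + l2 * x2 / psi2) / (l1 ** 2 / psi1 + l2 ** 2 / psi2)

print(round(estimate_a, 4), round(regression, 4))
print(round(estimate_b, 4), round(weighted, 4))
```

In this one-factor case the geometrical construction and the weighted least-squares expression give the same value, and that value is further from zero than the Method A estimate, which is what moving the point towards the factor-line would lead one to expect.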
For estimating a real measurable variate this procedure is
quite inadmissible. For estimating factors, it has an advantage 
in the eyes of those who consider that it is improper to maximise 
the specifics, as is done by the common practice of minimising 
the communalities: for this Method B minimises them again 
at a later stage. But note that by doing so it has changed the 
factors. If smaller specifics, that is larger communalities, had 
been used at the outset, the common factors would have been 
different and there would have been more of them. 

Before discussing the third assumption made about each 
man’s ability at right angles to the test-space, it is necessary 
to devote a section to the consideration of the presence of 
correlation among factor-estimates even of uncorrelated factors 
when the estimates are made by either Method A or by B. 


SECTION ELEVEN 


THE CORRELATION OF FACTOR-ESTIMATES 


THE factors we have had in mind throughout have been 
orthogonal, that is, uncorrelated factors. Their lines in our 
model are at right angles to each other and form a rectangular 
co-ordinate system of the factor-space. Centroid factors by 
their method of calculation, and specific factors by their very 
definition, are orthogonal to one another, and so too are most 
systems of factors derived from these by rotation. (Some
modern systems use correlated factors, but these we have not
so far mentioned.)


Yet when these uncorrelated factors are estimated their estimates are correlated! Essentially this is due to the fact that while the points representing persons are, in Method A, left in the test-space, the factor-lines are outside it. (The person-points in Method B are moved out of the test-space, and it seems probable that Bartlett estimates are less correlated than those made by Method A.)

In Section One it was emphasised that, in the original scattergram, correlation between variates represented by lines at right angles was shown by the ellipticity of the contours of density of the points representing persons, points such that their co-ordinates were the test-scores of the respective persons. If, in such a scattergram, the contours were circular, there would be no correlation. For a given score $x$, the average value of $y$ would be zero: there would be no tendency for it to alter as $x$ altered.

With a solid scattergram for three tests (their lines at right angles to one another), the points representing persons will be spherical in density-contours if there is no correlation present, ellipsoidal if there is. Such scattergrams can be imagined also for more tests, in more dimensions.
In our Section One we changed this by certain compressions
and extensions of the swarm of points until they were always 
spherical whatever the correlations. The test-lines were then 
no longer orthogonal but at angles whose cosines measured the 
correlation coefficients. The centroid factors described in 
Section Six, where full communalities of unity were used, are 
also in that test-space, and equal in number to the tests. There
are no specific factors in that section. The factor-lines are at
right angles to one another, the swarm of points is spherical,
and there is no correlation between the factors or between 
their ‘estimates’, which here indeed are not mere estimates 
but measures. The factors have no components outside the 
test-space, and it is immaterial what P’s ability is, at right 
angles to that space. 

It is different, however, as soon as fractional communalities
are used, and specific factors required. The total number of
factors is then greater than the number of tests, and the factor-space
of more dimensions than the test-space. In that factor-space,
the points representing the N persons whom we have
tested are no doubt distributed spherically in density: but we
do not know where they are, we only know their projections
on the test-space. Those projected points, in that test-space,
are spherical in density; but a sphere in n dimensions is, in say
n+c dimensions, an ellipsoid with c of its axes zero. So the
picture, in the factor-space, is one of ellipsoidally distributed
points, and therefore of correlation between the factor-estimates
if these points are used, as in Method A, to project on to the
factor-lines for each man's estimated factors.

If this explanation is unconvincing to a reader unaccustomed 
to thinking spatially in high dimensions, a consideration of the 
simple case portrayed in Figure 9 (page 33) may be illuminating. 
There only two tests form the battery, and there are three 
factors, G, $S_1$ and $S_2$, whose lines define a three-space. The
test-space is a plane, partly shaded in the figure. Only the 
positive halves of each test-line and each factor-line are shown. 
The test-lines, for example, really go on below the floor of 
the ‘room’ indicated by the factor-lines. 

Now the density-contours of points representing persons on the sloping plane which is the test-space are circles. But when these are projected on to one of the walls of the room, say on to $GOS_1$, they become ellipses, and the estimates of the factors G and $S_1$ are therefore correlated. In this there is no difficult spatial imagination demanded, for everyone knows that the shadow of a circular disc on a wall is an ellipse, if the disc is not parallel to the wall and the light perpendicular.

By analogy, the projection of the circular outline of an n-dimensional sphere on to a plane not in that n-dimensional space is also an ellipse, and thus correlation is created between the two factor-lines defining that plane. All this is due to using the points representing persons in the test-space as also representing them in the wider factor-space. Their points in that wider factor-space, did we know them, are really distributed spherically in it: but we do not know them, only their projections on the test-space.
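
The size of the correlation so created can be worked out for the two-test case of Figure 9. The Python sketch below is illustrative only; the loadings 0.8 and 0.6 are assumed, and the helper names are hypothetical. Each factor is estimated by Method A, that is by the regression weights of the preceding sections, and the correlation between the estimates of G and $S_1$ is then computed from the weights and the correlation between the two tests.

```python
import math

l1, l2 = 0.8, 0.6                          # assumed loadings of the tests on G
s1, s2 = math.sqrt(1 - l1 ** 2), math.sqrt(1 - l2 ** 2)   # specific loadings
r12 = l1 * l2                              # correlation between the two tests
det = 1 - r12 ** 2
r_inv = [[1 / det, -r12 / det],
         [-r12 / det, 1 / det]]            # reciprocal of the 2 x 2 matrix

# Correlations of each (orthogonal) factor with the two tests.
factors = {"G": (l1, l2), "S1": (s1, 0.0), "S2": (0.0, s2)}

def weights(lam):
    """Method A weights for a factor whose test correlations are lam."""
    return [sum(r_inv[i][j] * lam[j] for j in range(2)) for i in range(2)]

def covariance(wa, wb):
    """Covariance of two weighted sums of the standardised tests."""
    return (wa[0] * wb[0] + wa[1] * wb[1]
            + r12 * (wa[0] * wb[1] + wa[1] * wb[0]))

w = {name: weights(lam) for name, lam in factors.items()}

def correlation(a, b):
    """Correlation between the estimates of factors a and b."""
    return covariance(w[a], w[b]) / math.sqrt(
        covariance(w[a], w[a]) * covariance(w[b], w[b]))

corr_g_s1 = correlation("G", "S1")
corr_s1_s2 = correlation("S1", "S2")

print(round(corr_g_s1, 4), round(corr_s1_s2, 4))
```

Although G, $S_1$ and $S_2$ are themselves uncorrelated, their estimates are not: with these assumed loadings the estimates of G and $S_1$ correlate at about 0.70.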


SECTION TWELVE 
UNCORRELATED FACTOR-ESTIMATES 


THE situation is best approached by again considering a 
battery of two tests. In Figure 17 (page 49) let U and V be the 
lines of two tests, and the line OD a factor-line. The table-top 
is the test-space, the factor-space is of more dimensions: for 
the moment let us think of it as a three-space. 

The very numerous points P (N in number) are on the table-top, and are circular in density, round O. Along any line on the table-top the points are distributed normally. At any one position like P in the figure, there will be not one point but quite a number (if N is large), being the projections of all the points situated on the vertical line P'PP", both above and below the plane.

Now we do not know where these persons, concentrated at 
P (having all scored OX and OY) really are situated up and 
down the line P'PP”. But we do know how they are distributed 
up and down that line normally. We can therefore, if we are 
ruthless enough, allot to each of them a Position, moving them 
Vertically different distances away from the test-space (that is, 
making different assumptions about their abilities, although 
in the tests they are identical) until their distribution is normal 
with unit standard deviation. If we do this for each possible 
position of P all over the table-top, the N points, instead of 
being a disc-like crowd on the plane, will be a solid swarm 
round O, and their contours of density will be spheres. The 
process is just the opposite of projecting a solid sphere on to its 
equatorial plane. We are given the distribution over the 
equatorial plane and required to reconstitute the sphere. We 
can reconstitute a sphere, but except by a miracle the points 
will not be back where they were before the projection; it will 
not be the original sphere. 
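
The reconstitution just described can be tried numerically. What follows is only a rough sketch in Python with NumPy, assuming a three-space of factors and a table-top spanned by the first two axes; the names `true_points`, `rebuilt` and the sample size are illustrative assumptions, not part of the original argument.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons = 100_000

# True (unknown) positions: a spherical normal swarm in three-space.
true_points = rng.standard_normal((n_persons, 3))

# Projection on to the table-top: the height above the plane is lost.
on_table = true_points[:, :2]

# Ruthlessly allot a fresh height to every person, normally
# distributed with unit standard deviation.
new_heights = rng.standard_normal(n_persons)
rebuilt = np.column_stack([on_table, new_heights])

# The rebuilt swarm is spherical: its covariance is close to identity.
cov_rebuilt = np.cov(rebuilt, rowvar=False)

# But the allotted heights bear no relation to the true heights,
# so it is a sphere, not the original one.
r_height = np.corrcoef(true_points[:, 2], rebuilt[:, 2])[0, 1]

print(np.round(cov_rebuilt, 2))
print(round(float(r_height), 3))
```

In such a run the covariance of the rebuilt swarm should be close to the identity matrix, while the correlation between the true and the allotted heights should be close to zero.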
At this point let us, for clarity, rub out the parts of Figure 17 
we are not using (see Figure 18) and add a second and a third 
factor-line, OS₁ and OS₂. In the figure, one of the circular 
contour lines on the table is shown expanded up and down into 
a sphere. Points on the table have been moved up, or down, 
different distances till their configuration in the three-space of 
the factors is spherical. When these positions are projected 
on to OG, OS₁ and OS₂ they will give values for the 
factor-estimates which are uncorrelated. 
But it should be noted that this does not mean we can give 
factor-estimates to individual men. If Tom, Dick, Harry and 
others are all at one point on the table-top (that is, have each 
the same pair of scores in the tests U and V), we know that some 
of their points in the three-space of the factors must be above 
the table, and some below; and we can scatter them up and 
down in the correct distribution; but we do not know whether 
Tom goes up and Dick down, or vice versa. We can arrive at 
a set of uncorrelated estimates, but do not know which man 
has to be given which estimate! This makes the procedure 
useless for any practical purpose, such as vocational advice. 

What has been illustrated from the crudely simple case of two 
tests and three factors is also true for n tests and n + c factors. 
The scores of N persons in the n tests give that number of points 
spherically distributed in the test-space. From the point of 
view of the n +c space, however, these points are ellipsoidally 
distributed, and estimates of factors made by projecting them 
on to the factor-lines are therefore correlated. The N points 
can of course be moved orthogonally away from the test-space 
in different directions and by various amounts until their 
distribution in the factor-space is spherical, and their projections 
will then give uncorrelated factor-estimates. But we cannot 
allot these estimates to individual men, so they are not of 
practical use. 
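
The argument can be checked on the simple case of two tests and three factors. The sketch below, in Python with NumPy, is only an illustration under stated assumptions: the loadings of the two tests on the three factors are hypothetical numbers chosen so that each test-vector has unit length, and all variable names are invented for the purpose.

```python
import numpy as np

rng = np.random.default_rng(1)
n_persons = 200_000

# Hypothetical unit test-vectors for U and V in the three-space of
# factors (columns: general factor, first specific, second specific).
loadings = np.array([[0.8, 0.6, 0.0],
                     [0.6, 0.0, 0.8]])

factors = rng.standard_normal((n_persons, 3))   # true, unknown positions
scores = factors @ loadings.T                   # all that the tests reveal

# Regression estimates: each person's point in the test-space,
# projected on to the three factor-lines.
test_correlations = loadings @ loadings.T
estimates = scores @ np.linalg.solve(test_correlations, loadings)
cov_estimates = np.cov(estimates, rowvar=False)  # off-diagonals not zero

# Move every point orthogonally away from the test-space by a
# normally distributed amount with unit standard deviation.
perpendicular = np.cross(loadings[0], loadings[1])
perpendicular /= np.linalg.norm(perpendicular)
heights = rng.standard_normal(n_persons)
moved = estimates + np.outer(heights, perpendicular)
cov_moved = np.cov(moved, rowvar=False)          # close to identity

# Agreement of each kind of estimate with the true first factor.
r_regression = np.corrcoef(factors[:, 0], estimates[:, 0])[0, 1]
r_moved = np.corrcoef(factors[:, 0], moved[:, 0])[0, 1]

print(np.round(cov_estimates, 2))
print(np.round(cov_moved, 2))
print(round(float(r_regression), 3), round(float(r_moved), 3))
```

In this sketch the moved estimates are uncorrelated with one another, but the amounts of movement are allotted at random, so for any one person the moved estimate should agree less well with the true factor than the ordinary regression estimate does.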

A brilliant paper by H. Kestelman in Vol. V of the British 
Journal of Statistical Psychology has shown, using matrix 
algebra, how the values of the uncorrelated factor-estimates 
can be calculated. It does not, of course, mean—though the 
unwary reader might mistakenly think so—that these values 
can be given to individual men. That would be quite 
unjustifiable. 

Another way of looking at the matter may make the situation 
clearer. Suppose, in Figure 17, we had given to each of 
the several persons at P the estimate OQ. This set of such 
estimates would be correlated with estimates similarly made of 
another factor. But we could correct this by altering the 
estimates OQ until they were scattered, keeping the mean at Q, 
with the proper standard deviation, which is equal to the cosine 
of the angle between the factor-line OD and the perpendicular 
to the test-space, that is, to the square root of (1 − r²), where 
r is the correlation of the factor with its estimate. When 
this is done for every P and its Q, the N estimates along the 
factor-line will have a standard deviation of unity, and will be 
uncorrelated with those along other factors. 
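
The relation between the scatter and the angle can be verified with a little linear algebra. The following is a hedged sketch in Python with NumPy for two tests in a three-space of factors; the loadings are hypothetical, and the names `projector`, `r_squared` and `scatter_sd` are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical unit test-vectors for two tests in a three-space of factors.
loadings = np.array([[0.8, 0.6, 0.0],
                     [0.6, 0.0, 0.8]])

# Unit perpendicular to the test-space (the table-top).
perpendicular = np.cross(loadings[0], loadings[1])
perpendicular /= np.linalg.norm(perpendicular)

# Projector on to the test-space. Its diagonal gives, for each factor,
# the squared correlation between the factor and its regression estimate.
projector = np.eye(3) - np.outer(perpendicular, perpendicular)
r_squared = np.diag(projector)

# Cosine of the angle between each factor-line and the perpendicular.
cos_to_perpendicular = np.abs(perpendicular)

# Standard deviation of the scatter to be added about each estimate.
scatter_sd = np.sqrt(1.0 - r_squared)

# Variance of the estimates plus variance of the scatter.
total_variance = r_squared + scatter_sd ** 2

print(np.round(cos_to_perpendicular, 4))
print(np.round(scatter_sd, 4))
print(np.round(total_variance, 4))
```

The first two printed rows should agree, and the third should consist of ones: the variance of the estimates, together with the variance of the added scatter, makes up the unit variance of the factor.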